
Executive Summary

The affordable housing crisis disproportionately affects low-income families of color. One reason for this inequity is the Home Owners’ Loan Corporation’s (HOLC) redlining map, in which areas with a higher proportion of Black residents were given the lowest rating, leading to disinvestment. As wealthier white families moved out of the city, Black families were forced to stay in their neighborhoods, further entrenching racial and economic segregation. The consequences persist today: predominantly wealthy, white areas and predominantly low-income areas with more residents of color still align with HOLC’s redlining map. Although the housing crisis affects everyone, the redlining map ensured that the brunt of the consequences would fall on low-income families of color.

As the affordable housing crisis continues to worsen, policymakers must evaluate the existing resources for finding housing, especially for low-income families. One of the tools that prospective renters and landlords use is Craigslist. The Craigslist dataset consists of rental prices, location, square footage, description of the rental, and listing date. In addition to these existing measures, new measures were created: the percentage of listings that welcome Section 8 vouchers, body length, missing data (whether the listing is missing location information or square footage), and price per square foot. These measures were then used to construct a listing quality index, which measures the quality of a listing from the perspective of a low-income searcher.

Another resource that the government has leveraged to assist low-income households is the Housing Choice Voucher (HCV) program. The HCV program’s goal is to allow low-income families to move into better-resourced areas. In this study of the Craigslist dataset, we find that families with vouchers are still limited to poorer neighborhoods. As the price per square foot increases, the percentage of listings that accept Section 8 vouchers decreases. We also see a statistically significant difference between the most expensive and the least expensive listings in the percentage that accept Section 8 vouchers. This finding is further supported by the distribution of the listing quality index, where the highest quality listings are in areas such as West Roxbury, Dorchester, Roxbury, Mattapan, and Hyde Park, further concentrating low-income families in poorer neighborhoods and preventing them from moving into better-resourced areas. In the end, policymakers must reevaluate the existing tools available to help low-income families find housing.

Introduction

With the affordable housing crisis affecting millions of households across the nation, Boston is not exempt from the problem. According to Zumper, the average rent for a 1-bedroom apartment in 2023 is $2,700, a 6% increase from 2022 (Zumper 2023). A two-person household at 80% of area median income (AMI) ($89,750) renting at that price would be considered cost-burdened because it would spend over 30% of its income on rent (Boston Planning and Development Agency 2023). The numbers are even starker for families at 50% AMI ($56,100), who would spend 58% of their income on an average 1-bedroom apartment in Boston, and families at 30% AMI ($44,900), who would spend 72% of their income (Boston Planning and Development Agency 2023).
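
The rent burden figures above follow from simple arithmetic. The short R sketch below reproduces them from the rent and income limits quoted in the text; the variable names are illustrative and not taken from the original analysis.

```r
# Annual cost of the average 1-bedroom rent quoted above ($2,700 per month).
monthly_rent <- 2700
annual_rent <- monthly_rent * 12

# Two-person household income limits as cited in the text.
income_limits <- c(AMI_80 = 89750, AMI_50 = 56100, AMI_30 = 44900)

# Share of income spent on rent, as a percentage.
rent_burden <- round(100 * annual_rent / income_limits, 1)

# A household is cost-burdened when rent exceeds 30% of its income.
cost_burdened <- rent_burden > 30

print(rent_burden)
print(cost_burdened)
```

All three income levels exceed the 30% threshold, which is why even the 80% AMI household counts as cost-burdened.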

High rents not only prevent low- to moderate-income families from finding affordable homes, but they also perpetuate racial and economic segregation because poor families of color are limited to neighborhoods with fewer resources and public goods. Beyond the lack of accessibility, Black renters face discrimination in the housing search process. According to a study from Suffolk University, “Black renters experienced discrimination by real estate brokers and landlords in 71 percent of cases tested” (Irons 2020). White renters could arrange apartment viewings 80% of the time, whereas Black renters could visit potential apartments only 48% of the time.

For voucher holders, Suffolk University found prevalent and blatant discrimination. The goal of the voucher program is to allow low-income families to move into better-resourced communities. However, discrimination against voucher holders actively keeps families out of richer neighborhoods. Suffolk University reported that “about 40 percent of the time, the housing provider stopped communicating with testers altogether after the testers revealed they intended to use vouchers” (Suffolk University 2020). Furthermore, voucher holders face additional constraints such as tight timelines, existing poor housing conditions, negligent landlords, and potential evictions (DeLuca et al. 2013).

Given Boston’s high rents and low-income renters’ obstacles to finding affordable housing, this study will explore the following key topics:

Data & Methods

Redlining

In addition to the Craigslist data, I used the redlining map provided by BARI. The redlining map shows the neighborhoods in Boston that the Home Owners’ Loan Corporation (HOLC) deemed risky for investment from 1935 to 1950. Areas with Black families were given the lowest rating, and families were not given loans to buy homes in those areas.

Figure 1 shows HOLC’s map of Boston. The only area with an A rating is in Jamaica Plain, which neighbors the affluent town of Brookline. The areas given a D rating are predominantly in Roxbury and the South End, as well as parts of South Boston. The neighborhood of Dorchester was primarily given a C rating, with some parts rated D as well. However, on its own, the redlining map tells only part of the story. To understand its impact, we must also look at Boston’s present-day demographics.

Craigslist Datasets

The data used in this analysis were retrieved from the Boston Area Research Initiative (BARI), which scraped housing listings on Craigslist for Massachusetts from February 2020 until December 2021 (link). The variables in the data are listing ID, listing year, listing month, listing day, listing time, retrieved on, body, price, square footage, whether the listing allows pets, address, location, and census tract ID. Because the dataset covers all towns and cities across Massachusetts, I filtered the data to census tracts in Boston and in the Greater Boston area, leaving 43,257 listings. I then aggregated the data by census tract and calculated additional variables as described in Table 1.

Table 1. Variables for the aggregated listings
Variable | Description
AVG_PRICE | The average rental price of listings within the census tract
AVG_SQFT | The average square footage of listings within the census tract
PER_YES_SECTION_8 | Percentage of listings that approved and welcomed Section 8 voucher holders
PER_NO_SECTION_8 | Percentage of listings that do not welcome Section 8 voucher holders
MISSING_DATA | The number of listings missing either location or square footage information (or both)
AVG_BODY_STR | The average length of the body (description of the listings)
PRICE_PER_SQFT | The average price per square foot
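
As a rough illustration of how listing-level data can be rolled up into tract-level variables like those in Table 1, the base R sketch below aggregates a small made-up data frame. The listing-level column names (`CT_ID`, `PRICE`, `AREA_SQFT`, `SECTION_8`, `NO_SECTION_8`, `LOCATION`) and the 0/1 coding of the Section 8 flags are assumptions for the example, not the original code.

```r
# Hypothetical listing-level data; column names and values are assumptions.
listings <- data.frame(
  CT_ID        = c("A", "A", "A", "B", "B"),
  PRICE        = c(2000, 2400, 2200, 3000, 3200),
  AREA_SQFT    = c(800, NA, 1000, 1200, 900),
  SECTION_8    = c(1, 0, 0, 0, 0),   # 1 = listing welcomes vouchers
  NO_SECTION_8 = c(0, 0, 0, 0, 1),   # 1 = listing turns vouchers away
  LOCATION     = c("Roxbury", NA, "Roxbury", "Back Bay", "Back Bay")
)

# Summarize one tract's listings into a single row.
summarize_tract <- function(d) {
  data.frame(
    CT_ID             = d$CT_ID[1],
    AVG_PRICE         = mean(d$PRICE),
    AVG_SQFT          = mean(d$AREA_SQFT, na.rm = TRUE),
    PER_YES_SECTION_8 = mean(d$SECTION_8),
    PER_NO_SECTION_8  = mean(d$NO_SECTION_8),
    MISSING_DATA      = sum(is.na(d$LOCATION) | is.na(d$AREA_SQFT))
  )
}

# Split by tract, summarize each piece, and stack the results.
by_tract <- do.call(rbind, lapply(split(listings, listings$CT_ID), summarize_tract))
print(by_tract)
```

Here the Section 8 shares are stored as proportions between 0 and 1, which matches the scale of the summary statistics reported for those variables.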

Table 2 summarizes all the variables in the aggregated dataset. Note that to calculate PRICE_PER_SQFT, any listings with an NA value for square footage were removed. In addition, outliers (listings with a square footage over 3,000 square feet or a price below $300) were removed.
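
The filtering rules for price per square foot can be sketched in a few lines of base R. The data below are invented, and the column names `PRICE` and `AREA_SQFT` are assumptions; whether the tract-level value is a mean of listing-level ratios (as here) or a ratio of tract averages is also an assumption, since the text does not say.

```r
# Hypothetical listings; values chosen to exercise each filter.
listings <- data.frame(
  PRICE     = c(2400, 250, 3000, 1800, 2000),
  AREA_SQFT = c(800, 600, 3500, NA, 1000)
)

# Keep listings that report square footage, are at most 3,000 sq ft,
# and cost at least $300.
keep <- !is.na(listings$AREA_SQFT) &
  listings$AREA_SQFT <= 3000 &
  listings$PRICE >= 300

clean <- listings[keep, ]
clean$PRICE_PER_SQFT <- clean$PRICE / clean$AREA_SQFT

print(clean)
print(mean(clean$PRICE_PER_SQFT))
```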

According to the summary table, the average rent was $2,395.40 and the average square footage was 980.5 square feet. The average length of the body was 159.7 words, and the average price per square foot was $2.60. In terms of Section 8 vouchers, the majority of listings did not mention vouchers at all, so the mean rounds to 0.0.

Table 2. Summary table of variables from aggregated data

Variable | Unique (#) | Missing (%) | Mean | SD | Min | Median | Max
AVG_PRICE | 176 | 0 | 2395.4 | 338.3 | 1522.4 | 2391.5 | 3581.7
AVG_SQFT | 176 | 0 | 980.5 | 212.4 | 565.8 | 974.6 | 1614.9
PER_YES_SECTION_8 | 72 | 0 | 0.0 | 0.1 | 0.0 | 0.0 | 0.4
PER_NO_SECTION_8 | 4 | 0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0
MISSING_DATA | 112 | 0 | 160.5 | 285.4 | 0.0 | 39.5 | 1590.0
AVG_BODY_STR | 176 | 0 | 159.7 | 54.1 | 57.6 | 148.3 | 331.9
PRICE_PER_SQFT | 176 | 0 | 2.6 | 0.7 | 1.1 | 2.4 | 4.6

Section 8 Vouchers

Looking more closely at the variables related to Section 8, only 3.7% of Craigslist listings overall actively welcomed Section 8 vouchers. The percentage of listings that explicitly turned away voucher holders was very low (6.175e-05). Thus, the overwhelming majority of listings (~96%) did not mention Section 8 at all. Additionally, listings that welcome voucher holders are not evenly distributed throughout the city of Boston. Figure 5 shows that higher percentages of voucher approvals are found in neighborhoods such as Hyde Park, Mattapan, Dorchester, and Roxbury, where rent might be lower as well.

To determine whether the percentage of listings welcoming Section 8 differs significantly across rental prices, I first divided the census tracts into quartiles based on price per square foot and then conducted an ANOVA test. The results are shown below. Since the p-value is less than 0.05, the differences are statistically significant. Dividing the between-quartile sum of squares by the total sum of squares (0.1485 / 1.1224) gives an R^2 of about 0.13, meaning the quartiles explain roughly 13% of the variation.

##                                     Df Sum Sq Mean Sq F value   Pr(>F)    
## as.factor(PRICE_PER_SQFT_QUARTILE)   3 0.1485 0.04948   8.739 1.99e-05 ***
## Residuals                          172 0.9739 0.00566                     
## ---
## Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
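
For a one-way ANOVA, the share of variance explained can be read off the table directly as the between-group sum of squares divided by the total sum of squares. The sketch below applies that formula to the values printed above and also recomputes the F statistic as a consistency check; the variable names are illustrative, and a figure computed from a different model specification may differ.

```r
# Values copied from the ANOVA table above.
ss_between <- 0.1485   # Sum Sq for the quartile factor
ss_resid   <- 0.9739   # Sum Sq for residuals
df_between <- 3
df_resid   <- 172

# Proportion of variance explained (R^2, also called eta squared).
r_squared <- ss_between / (ss_between + ss_resid)

# F statistic and its p-value, recomputed from the sums of squares.
f_value <- (ss_between / df_between) / (ss_resid / df_resid)
p_value <- pf(f_value, df_between, df_resid, lower.tail = FALSE)

print(round(r_squared, 4))
print(round(f_value, 2))
print(p_value)
```

Applied to the printed sums of squares, this gives an R^2 of roughly 0.13 and an F value that agrees with the table up to rounding.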

The post-hoc TukeyHSD test shows that the only significant difference was between quartile 4 (the highest price per square foot) and quartile 1 (the lowest price per square foot), with a p-value of 0.0087349. None of the other pairwise differences between quartiles were significant. Figure 5 is a bar graph that shows the differences in the percentage of listings that welcome Section 8 vouchers across the four price per square foot quartiles.
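
A post-hoc comparison of this kind can be run in base R with `aov()` followed by `TukeyHSD()`. The sketch below uses simulated tract data (176 tracts, 44 per quartile), so its numbers are not the study's results; only the variable names `PER_YES_SECTION_8` and `PRICE_PER_SQFT_QUARTILE` come from the report.

```r
set.seed(1)

# Simulated tract-level data: voucher-friendly share falls with price quartile.
# The means and spread are made up for illustration.
tracts <- data.frame(
  PRICE_PER_SQFT_QUARTILE = factor(rep(1:4, each = 44)),
  PER_YES_SECTION_8 = pmax(0, c(
    rnorm(44, 0.08, 0.05),
    rnorm(44, 0.05, 0.05),
    rnorm(44, 0.04, 0.05),
    rnorm(44, 0.01, 0.05)
  ))
)

# One-way ANOVA, then all pairwise quartile comparisons.
fit <- aov(PER_YES_SECTION_8 ~ PRICE_PER_SQFT_QUARTILE, data = tracts)
tukey <- TukeyHSD(fit)

# One row per pair (e.g. "4-1"), with the difference, interval, and adjusted p.
pairs <- tukey$PRICE_PER_SQFT_QUARTILE
print(round(pairs, 4))
```

Each row name identifies a pair of quartiles, so the "4-1" row is the comparison between the most and least expensive quartiles discussed above.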

In addition to the ANOVA test, I ran correlations and created a regression model with PER_YES_SECTION_8 as the dependent variable and PRICE_PER_SQFT as the independent variable. The correlation between PER_YES_SECTION_8 and PRICE_PER_SQFT is -0.354, which means that as the price per square foot rises, the percentage of listings that welcome Section 8 vouchers decreases. Because the p-value was 5.783e-06, the result is statistically significant. Figure 6 shows the regression plot of price per square foot and the percentage of listings that welcome Section 8.

Looking at the regression plot, we can see that most of the tracts with a higher percentage of listings that welcome Section 8 vouchers are below $3 per square foot. Since a majority of the listings did not mention Section 8, let alone welcome vouchers, the fitted line remains relatively low, and its downward slope is consistent with the correlation of -0.354 reported above.
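
In a regression with a single predictor, the R^2 equals the squared correlation, so a correlation of -0.354 corresponds to about 12.5% of the variance explained. The sketch below shows that identity with `cor.test()` and `lm()` on simulated data; the data-generating numbers are invented, and only the relationship between the two statistics is the point.

```r
set.seed(42)

# Simulated tract data; the coefficients below are made up.
price_per_sqft <- runif(176, 1.1, 4.6)
per_yes <- pmax(0, 0.12 - 0.03 * price_per_sqft + rnorm(176, 0, 0.05))

# Pearson correlation with its significance test.
ct <- cor.test(per_yes, price_per_sqft)
r <- unname(ct$estimate)

# Simple linear regression with the same two variables.
fit <- lm(per_yes ~ price_per_sqft)
r2 <- summary(fit)$r.squared

print(c(r = r, r_squared_from_r = r^2, r_squared_from_lm = r2))

# Squared value of the correlation reported in the text.
print(round((-0.354)^2, 4))
```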

Listing Quality Index

Finally, to measure the overall quality of listings, I created a new variable called “LISTING_QUALITY_INDEX” based on latent constructs such as whether the price was reasonable, whether the listing contained metadata such as location and square footage, whether the body length was appropriate, and whether the listing mentioned Section 8. A more detailed description of construction of the LISTING_QUALITY_INDEX can be found in the Appendix. The overall goal of the listing quality index is to provide a measure for determining whether a listing is helpful and informative for a low-income renter and whether it could ultimately assist them in securing housing.

The listing quality index is composed of five scores: PRICE_SCORE, AREA_SQFT_SCORE, LOCATION_SCORE, BODY_SCORE, and SEC_8_SCORE. The scores were added up to construct the LISTING_QUALITY_INDEX, a process based on the approach of the Connecticut Data Collaborative and Trinity College’s Liberal Arts Action Lab. Table 4 shows a summary of each score and the index.

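Table 4. Summary of each score and the listing quality index
Variable | Unique (#) | Missing (%) | Mean | SD | Min | Median | Max
PRICE_SCORE | 4 | 0 | 1.6 | 0.9 | 0.0 | 1.0 | 3.0
AREA_SQFT_SCORE | 2 | 0 | 0.3 | 0.5 | 0.0 | 0.0 | 1.0
LOCATION_SCORE | 2 | 0 | 0.9 | 0.3 | 0.0 | 1.0 | 1.0
BODY_SCORE | 2 | 0 | 1.0 | 0.1 | 0.0 | 1.0 | 1.0
SEC_8_SCORE | 3 | 0 | 1.0 | 0.1 | 0.0 | 1.0 | 2.0
LISTING_QUALITY_INDEX | 8 | 0 | 4.9 | 1.0 | 1.0 | 5.0 | 8.0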

The summary table shows that the distribution of the LISTING_QUALITY_INDEX is relatively normal, with a mean of 4.9 and a median of 5.0. When the data are aggregated by census tract, the mean is 6.0. The distribution of the listing quality index across Boston can be seen in Figure 7. Based on the results, most of the higher-quality listings are in West Roxbury, Hyde Park, Mattapan, and Dorchester.

Discussion

Impact of Redlining on the Boston Housing Market Today

The redlining map in Figure 1 shows which neighborhoods have experienced disinvestment over the years. This is reflected in the demographics of Boston, where Black families are still more concentrated in previously redlined areas and white families reside in areas that were not redlined. Comparing the redlining map with the listing quality index map (Figure 7), we can also see the remnants of redlining. Although the neighborhoods further south were not rated D, they had a C rating, which indicated that some immigrants and Black families were already living there. Thus, these areas were also likely to be redlined as the maps changed. In the end, the HOLC maps created inequities that have lasting consequences today.

Overall Craigslist Rental Scene

With an average rent of $2,395.40, as shown in Table 2, it is no surprise that there is an acute affordable housing crisis in Boston. While wealthier families may be able to afford the higher rents, low- to moderate-income families are left with few options. For families with vouchers, the search is even more difficult due to the discrimination they face and the limited number of landlords willing to take them. Although vouchers were meant to be a people-based solution, they still contribute to racial and economic segregation because voucher holders are limited in their search. Additionally, the search is even more stressful for them because they have a short window to find housing before their voucher expires and they have to enter a years-long waiting list again. With such a low percentage of listings that actively welcome Section 8 vouchers (3.7%), and only in areas with existing Black communities, low-income families with vouchers are constrained by time, money, and location, further perpetuating segregation.

Section 8 Vouchers in the Craigslist Data

The TukeyHSD test revealed that the only significant difference in the percentage of listings within a census tract that welcome Section 8 vouchers was between quartile 1 (the least expensive price per square foot) and quartile 4 (the most expensive price per square foot). Unsurprisingly, the 4th quartile consists of Back Bay, South End, South Boston, and neighborhoods adjacent to Brookline, as seen in Figure 8. The first quartile consists predominantly of neighborhoods such as Roxbury, Mattapan, parts of Dorchester, parts of Brighton, and East Boston.

The Housing Choice Voucher (HCV) program was created after public housing was deemed a failure. The purpose of the HCV program was to leverage the private market to enable low-income families to move into better-resourced neighborhoods. However, we can see that there is a significant difference in the percentage of listings that approve vouchers between tracts at the lowest and highest ends of the price per square foot spectrum. This ultimately has implications for low-income households because they are limited to the lowest quartile, places with a low price per square foot. Moreover, the fact that low-income families do not have access to certain neighborhoods contributes to inequities in housing.

The differences in the percentage of listings that welcome Section 8 vouchers across price per square foot quartiles can also be seen in a regression model where price per square foot is the independent variable and the percentage of Section 8 listings is the dependent variable. Figure 6 reveals that as the price per square foot increases, the percentage of listings that approve Section 8 vouchers decreases. This perpetuates racial and economic segregation because it prevents poor families from moving to wealthier areas. Furthermore, a large portion of listings did not mention Section 8 vouchers at all, which leaves low-income families in uncertainty and adds more time and confusion to their search.

Listing Quality Index

Finally, the listing quality index is another tool that shows the inequities between census tracts. While the index values are somewhat normally distributed (as seen in Table 4), listing quality is not spread evenly across Boston, as seen in Figure 7. This means that voucher holders are limited to certain neighborhoods not only because of price, but also because of the quality of the listings on Craigslist. Ultimately, this suggests that Craigslist is not a tool meant for low-income families, especially those who are voucher holders. Additionally, landlords who post their listings on Craigslist likely do not have low-income families in mind.

Conclusion

Revisiting the key topics of the paper as stated in the introduction, we can see patterns of inequity in the Craigslist dataset. The HOLC’s maps spurred segregation and disinvestment in neighborhoods with a predominantly Black population. Over the years, the impact of the map ensured that Black families were constrained to specific neighborhoods and were unable to accrue wealth, as seen through the demographic data and maps. Areas with a lower percentage of low- to moderate-income families were predominantly white, while areas with a higher percentage of low- to moderate-income families were predominantly Black.

One of the solutions to address the patterns of segregation and inequity was the Housing Choice Voucher program, which was meant to enable low-income families to move into higher-income areas. However, as we saw in the ANOVA test and the bar graph, listings that are willing to accept vouchers are not evenly distributed. Tracts with a higher price per square foot had fewer listings that accepted vouchers, which ultimately prevents families from moving into specific neighborhoods. This is further supported by the regression model of price per square foot and the percentage of listings that accept Section 8 vouchers: as the price per square foot rises, the percentage of listings that accept Section 8 vouchers falls. This shows that families that hold vouchers are still limited to poorer neighborhoods.

Finally, the overall listing quality varies across the city. Listings with information that is helpful for low-income searchers are more concentrated in specific neighborhoods. This further prevents families with vouchers from moving into better-resourced areas. Additionally, low-quality listings have repercussions for families with vouchers because of the limited time they have to search for apartments. Low-income renters cannot waste valuable time sifting through low-quality listings, which ultimately means that Craigslist as a rental tool is not meant for them. Policymakers must ensure that low-income families have the tools necessary to find housing in any neighborhood that they desire. Moreover, policymakers should revisit the HCV program and evaluate whether it is meeting its goals.

References

“Boston, MA Rent Prices.” Zumper. April 19th, 2023. https://www.zumper.com/rent-research/boston-ma

“Income, Asset, and Price Limits.” Boston Planning & Development Agency. Accessed April 19th, 2023. https://www.bostonplans.org/housing/income-asset-and-price-limits

Irons, Meghan E. “Researchers expected ‘outrageously high’ discrimination against Black renters. What they found was worse than imagined.” The Boston Globe. July 1, 2020. https://www.bostonglobe.com/2020/07/01/metro/blacks-voucher-holders-face-egregious-housing-discrimination-study-says/

“Qualified Renters Need Not Apply.” Suffolk University. June 26, 2020. https://www.suffolk.edu/news-features/news/2020/06/27/01/03/qualified-renters-need-not-apply

DeLuca, Stephanie, Philip M. E. Garboden, and Peter Rosenblatt. “Segregating Shelter: How Housing Policies Shape the Residential Locations of Low-Income Minority Families.” The Annals of the American Academy of Political and Social Science 657 (2013). https://www.jstor.org/stable/23479104

Appendix

Listing Quality Score Methodology

To determine the components that make up rental listing quality, I first created a list of guiding questions:

  • Does the rental price make sense for the Boston and Greater Boston area?

  • Does the listing provide a location?

  • Does the listing provide a square footage value, and does it make sense?

  • Is the listing description unusually long or short?

  • Does the listing mention Section 8/Housing Choice Vouchers?

For each guiding question, I aligned it with a variable in the dataset as seen in the table below.

Guiding Question | Variable
Does the rental price make sense for the Boston and Greater Boston area? | PRICE
Does the listing provide a location? | LOCATION
Does the listing provide a square footage value, and does it make sense? | AREA_SQFT
Is the listing description unusually short or long? | BODY_STRING_COUNT
Does the listing mention Section 8/Housing Choice Vouchers? | SECTION_8, NO_SECTION_8

Price Score

For the listing price, I divided the prices into three categories: unrealistically low prices, prices that make sense for low-income families, and prices that are out of reach for low-income families. For the purposes of this document, I approached the scoring from the perspective of a low-income family requiring a 2 bedroom apartment. According to HUD home rental limits, the appropriate price range would be between $1,577 and $2,023 in the Boston/Cambridge/Quincy area. Therefore, I assigned a price score of 3 to any listing within that range. For listings between $1,315 and $1,577, I assigned a value of 2 because this range runs from the lower limit for a 1 bedroom to the lower limit for a 2 bedroom. Although families would ideally prefer to have more space, a one bedroom is still feasible.

For prices above $2,023, the upper limit of a 2 bedroom, I assigned a value of 1 because the prices are likely legitimate, but out of range for low-income families searching for a 2 bedroom apartment. I also assigned a value of 1 for prices between $500 and $1,315. Although it is unlikely that we’d find a listing for a one bedroom or studio for less than $1,315, these prices are legitimate because they could be listings for a single bedroom within a multi-bedroom unit. However, this situation is not ideal for an entire family, especially one with children. Finally, I assigned a value of 0 to any listings with a price below $500 since they are price outliers and unrealistic for the city of Boston, ultimately indicating a low-effort post. The price scores are summarized in the table below.

Price Values | Price Score | Reasoning
$1,577 - $2,023 | 3 | Within the range of HUD recommended rent limits
$1,315 - $1,577 | 2 | Lower limit of 1 bedroom to lower limit of 2 bedroom
>$2,023 | 1 | Legitimate prices, but out of the range of low-income families
Between $500 and $1,315 | 1 | Legitimate prices, but not feasible for a low-income family
<$500 | 0 | Price outliers for the city of Boston and surrounding areas
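
The price bands in the table can be written as a small scoring function. This is a sketch rather than the original code: the function name `price_score` is hypothetical, and the treatment of prices that fall exactly on a boundary (for example, whether $1,577 scores 2 or 3) is an assumption, since the table does not specify it.

```r
# Score a vector of monthly rents using the bands in the table above.
# Assumed boundary handling: each band includes its lower limit, and the
# $1,577 - $2,023 band also includes its upper limit.
price_score <- function(price) {
  ifelse(price < 500, 0,
  ifelse(price < 1315, 1,
  ifelse(price < 1577, 2,
  ifelse(price <= 2023, 3, 1))))
}

example_prices <- c(400, 500, 1200, 1315, 1577, 2023, 2024, 2700)
example_scores <- price_score(example_prices)
print(data.frame(PRICE = example_prices, PRICE_SCORE = example_scores))
```

A missing price yields a missing score, so listings without a price would need to be handled before the scores are summed.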

Location Score

For the location, I will simply assign the listing a value of 1 if it contains a location and a value of 0 if it does not. This is because while the location is helpful, its information may already be highlighted in the body, which can provide more helpful detail on where the listing is located.

Square Footage Score

In terms of square footage, I assigned the listing a value of 1 if it contained a square footage value that was between 500 square feet and 3,000 square feet. If a listing was considered an outlier (values that did not make sense) or did not have a value, then I assigned it a value of 0. Like the location, this information may already be highlighted in the body with more detail.

Listing Description Score

Although most listings have appropriate descriptions, some do not contain enough detail, and others contain too much information. Renters who are trying to find housing quickly do not have the time to sift through too much information, nor do they want to risk trusting a listing that does not provide enough. Therefore, any listing with fewer than 15 words or more than 1,500 words in the body was given a score of 0, and listings that fall between those values received a score of 1.

Section 8 Mentions

Finally, listings that mention Section 8 in a positive light (welcoming voucher holders) were assigned a value of 2. Listings that do not mention Section 8 at all were assigned a value of 1, since the brokers or landlords may still approve voucher holders without explicitly saying so. Listings that say “No section 8” were assigned a value of 0 because they explicitly discriminate against voucher holders.

Calculating the Index

To calculate the index, I added the different scores together, as recommended by the Connecticut Data Collaborative and Trinity College’s Liberal Arts Action Lab.
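
Putting the five component scores together, the index could be computed along the following lines. This is a minimal sketch under stated assumptions: the function name `score_listings` is hypothetical, the Section 8 flags are assumed to be coded 0/1, an empty location string is treated as missing, and boundary values follow the same assumptions as the thresholds described in this appendix.

```r
# Add the five component scores and their sum to a listings data frame.
score_listings <- function(d) {
  # Price: bands from the Price Score table (boundary handling assumed).
  d$PRICE_SCORE <- ifelse(d$PRICE < 500, 0,
                   ifelse(d$PRICE < 1315, 1,
                   ifelse(d$PRICE < 1577, 2,
                   ifelse(d$PRICE <= 2023, 3, 1))))
  # Location: 1 if present, 0 if missing.
  d$LOCATION_SCORE <- as.numeric(!is.na(d$LOCATION) & d$LOCATION != "")
  # Square footage: 1 if reported and between 500 and 3,000 sq ft.
  d$AREA_SQFT_SCORE <- as.numeric(
    !is.na(d$AREA_SQFT) & d$AREA_SQFT >= 500 & d$AREA_SQFT <= 3000
  )
  # Description: 1 if the body has between 15 and 1,500 words.
  d$BODY_SCORE <- as.numeric(
    d$BODY_STRING_COUNT >= 15 & d$BODY_STRING_COUNT <= 1500
  )
  # Section 8: 0 if turned away, 2 if welcomed, 1 if not mentioned.
  d$SEC_8_SCORE <- ifelse(d$NO_SECTION_8 == 1, 0,
                   ifelse(d$SECTION_8 == 1, 2, 1))
  d$LISTING_QUALITY_INDEX <- d$PRICE_SCORE + d$LOCATION_SCORE +
    d$AREA_SQFT_SCORE + d$BODY_SCORE + d$SEC_8_SCORE
  d
}

# Three made-up listings spanning the range of the index.
listings <- data.frame(
  PRICE             = c(1800, 2700, 400),
  LOCATION          = c("Mattapan", NA, "Boston"),
  AREA_SQFT         = c(900, NA, 4000),
  BODY_STRING_COUNT = c(150, 160, 10),
  SECTION_8         = c(1, 0, 0),
  NO_SECTION_8      = c(0, 0, 1)
)

scored <- score_listings(listings)
print(scored[, c("PRICE_SCORE", "LOCATION_SCORE", "AREA_SQFT_SCORE",
                 "BODY_SCORE", "SEC_8_SCORE", "LISTING_QUALITY_INDEX")])
```

Because the components are simply summed, the index ranges from 0 to 8, and the price score carries the most weight of any single component.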
